Budget Allocation using Weakly Coupled, Constrained Markov Decision Processes

Authors

  • Craig Boutilier
  • Tyler Lu

Abstract

We consider the problem of budget (or other resource) allocation in sequential decision problems involving a large number of concurrently running subprocesses, whose only interaction is through their gradual consumption of budget (or the resource in question). We use the case of an advertiser interacting with a large population of target customers as a primary motivation. We consider two complementary problems that comprise our key contributions. Our first contribution addresses the problem of computing MDP value functions as a function of the available budget. In contrast to standard constrained MDPs, which find optimal value functions given a fixed expected budget constraint, our aim is to assess the tradeoff between expected budget spent and the value attained when using that budget optimally. We show that optimal value functions are concave in budget. More importantly, in the finite-horizon case, we show there are a finite number of useful budget levels. This gives rise to piecewise-linear, concave value functions (piecewise-constant if we restrict to deterministic policies) with a representation that can be computed readily via dynamic programming. This representation also supports natural approximations. Our model not only allows the assessment of budget/value tradeoffs (e.g., to find the “sweet spot” in spend), but also plays an important role in the allocation of budget across competing subprocesses. Our second contribution is a method for constructing a policy that prescribes the joint policy to be taken across all subprocesses given the joint state of the system, subject to a global budget constraint. We cast the problem as a weakly coupled MDP in which budget is allocated online to the individual subprocesses based on their observed (initial) states and the subprocess-specific value functions. We show that the budget allocation process can be cast as a multi-item, multiple-choice knapsack problem (MCKP), which admits an efficient greedy algorithm to determine optimal allocations. We also discuss the possibility of online, per-stage re-allocation of budget to adaptively satisfy strict rather than expected budget constraints.

* This paper is an extended version of a paper by the same title to appear in the Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI-16), New York, 2016.
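
The greedy allocation idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each subprocess already comes with a piecewise-linear, concave value function given as a sorted list of (budget, value) breakpoints, and it treats the budget as an expected budget, so the last segment may be taken only partially. The function name `greedy_allocate` and the data layout are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (budget, value) breakpoint of one value function


def greedy_allocate(curves: List[List[Point]],
                    total_budget: float) -> Tuple[List[float], float]:
    """Split total_budget across subprocesses by best marginal value.

    Assumption: each curve is sorted by budget and is concave, so the
    slopes of its segments are non-increasing from left to right.
    """
    segments = []
    for i, pts in enumerate(curves):
        for k in range(1, len(pts)):
            (b0, v0), (b1, v1) = pts[k - 1], pts[k]
            width = b1 - b0
            slope = (v1 - v0) / width  # value gained per unit of budget
            # Negated slope so that an ascending sort puts the best first;
            # (i, k) breaks ties in left-to-right order within a curve.
            segments.append((-slope, i, k, width))
    segments.sort()

    # Every subprocess starts at its cheapest breakpoint.
    alloc = [pts[0][0] for pts in curves]
    value = sum(pts[0][1] for pts in curves)
    remaining = total_budget - sum(alloc)
    if remaining < 0:
        raise ValueError("budget does not cover the minimum spend")

    for neg_slope, i, _k, width in segments:
        if remaining <= 0 or neg_slope >= 0:
            break  # out of budget, or no segment adds value any more
        take = min(width, remaining)  # the last segment may be partial
        alloc[i] += take
        value += -neg_slope * take
        remaining -= take
    return alloc, value


if __name__ == "__main__":
    curves = [
        [(0, 0), (1, 10), (3, 14)],  # subprocess 0
        [(0, 0), (2, 12), (4, 13)],  # subprocess 1
    ]
    print(greedy_allocate(curves, 4))  # prints ([2, 2], 24.0)
```

Because each curve is concave, sorting all segments globally by slope never selects a later segment of a curve before an earlier one, which is what lets a single pass stand in for a knapsack search in this simplified setting.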

Similar resources

Budget Optimization for Online Advertising Campaigns with Carryover Effects

While it is relatively easy to start an online advertising campaign, proper allocation of the marketing budget is far from trivial. A major challenge faced by the marketers attempting to optimize their campaigns is in the sheer number of variables involved, the many individual decisions they make in fixing or changing these variables, and the nontrivial short and long-term interplay among these...

Budget Allocation in Binary Opinion Dynamics

In this article we study the allocation of a budget to promote an opinion in a group of agents. We assume that their opinion dynamics are based on the well-known voter model. We are interested in finding the most efficient use of a budget over time in order to manipulate a social network. We address the problem using the theory of discounted Markov decision processes. Our contributions can be su...

Optimal Resource Allocation and Policy Formulation in Loosely-Coupled Markov Decision Processes

The problem of optimal policy formulation for teams of resource-limited agents in stochastic environments is composed of two strongly-coupled subproblems: a resource allocation problem and a policy optimization problem. We show how to combine the two problems into a single constrained optimization problem that yields optimal resource allocations and policies that are optimal under these allocat...

Solving Very Large Weakly Coupled Markov Decision Processes

We present a technique for computing approximately optimal solutions to stochastic resource allocation problems modeled as Markov decision processes (MDPs). We exploit two key properties to avoid explicitly enumerating the very large state and action spaces associated with these problems. First, the problems are composed of multiple tasks whose utilities are independent. Second, the actions tak...

Solving Markov decision processes for network-level post-hazard recovery via simulation optimization and rollout

Computation of optimal recovery decisions for community resilience assurance post-hazard is a combinatorial decision-making problem under uncertainty. It involves solving a large-scale optimization problem, which is significantly aggravated by the introduction of uncertainty. In this paper, we draw upon established tools from multiple research communities to provide an effective solution to thi...

Publication date: 2016